Modeling spontaneous speech variability in professional dictation

Authors

  • Hauke Schramm
  • Xavier L. Aubert
  • Bart Bakker
  • Carsten Meyer
  • Hermann Ney
Abstract

In this work, we present a word-level model combination approach that aims to improve the modeling of spontaneous speech variabilities on a highly spontaneous, real-life medical transcription task. The technique (1) separates speech variabilities into pre-defined classes, (2) generates variability-specific acoustic and pronunciation models and (3) combines these models later in the search procedure on a word-level basis. A theoretical framework is provided for efficient integration of the specific acoustic and pronunciation models into the search procedure. Our algorithm is a general approach that can be applied to model various speech variabilities. In our experiments, we focused on the variabilities related to filled pauses, rate of speech and speaker accent. Our best system combines six variability-specific acoustic and pronunciation models on a word level and achieves a relative word error rate reduction of 13% compared to the baseline. In a number of contrast experiments we evaluated the importance of different components in our system and explored ways to reduce the system complexity. © 2005 Elsevier B.V. All rights reserved.
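The abstract does not state the combination rule itself, so the following is only a minimal sketch of what combining variability-specific models at the word level could look like. It assumes each class-specific model has already produced a log-likelihood for the same word hypothesis, and that the classes carry prior weights. The function name `combine_word_score`, the class labels, and all numbers are hypothetical and are not taken from the paper.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-domain values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def combine_word_score(class_log_likelihoods, class_log_priors, use_max=False):
    """Combine per-class log-likelihoods for ONE word hypothesis.

    Assumption (not from the paper): the word score is either the
    prior-weighted sum over variability classes or its maximum
    (Viterbi-style) approximation.
    """
    joint = [
        class_log_likelihoods[c] + class_log_priors[c]
        for c in class_log_likelihoods
    ]
    return max(joint) if use_max else logsumexp(joint)


# Hypothetical log-likelihoods of one word under three class-specific models.
scores = {"baseline": -42.0, "fast_speech": -39.5, "filled_pause": -47.0}
# Hypothetical uniform class priors.
priors = {c: math.log(1.0 / len(scores)) for c in scores}

sum_score = combine_word_score(scores, priors)
max_score = combine_word_score(scores, priors, use_max=True)
```

In this sketch the maximum variant keeps only the best-matching class for each word, which is cheaper inside a search procedure, while the sum variant accounts for all classes. Which rule the authors actually use is described in the full paper, not in this abstract.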


Similar resources

Modeling spontaneous speech variability for large vocabulary continuous speech recognition

In this work a number of novel techniques for improved treatment of spontaneous speech variabilities in large vocabulary automatic speech recognition are developed and evaluated on US English conversational speech and spontaneous medical dictations. Two main aspects of spontaneous speech modeling are addressed: The general handling of pronunciation variability and the individual and parallel tr...


Transformation-based error correction for speech-to-text systems

We present a universal approach to uncover and correct systematic local errors in complex speech-to-text systems. Whereas previous work to minimize speech recognition errors mostly relies on N-best lists or word lattices, our approach is merely based on the first-best system output. The paradigm of Transformation-Based Learning (TBL) is adapted from tagging-like applications to the more complica...


Modeling Filled Pauses in Medical Dictations

Filled pauses are characteristic of spontaneous speech and can present considerable problems for speech recognition by being often recognized as short words. An um can be recognized as thumb or arm if the recognizer's language model does not adequately represent FP's. Recognition of quasi-spontaneous speech (medical dictation) is subject to this problem as well. Results from medical dictations ...


Structure and annotation of Polish LVCSR speech database

This paper reports on the problems occurring in the process of building LVCSR (Large Vocabulary Continuous Speech Recognition) corpora based on the internal evaluation of the Polish database JURISDIC. The initial assumptions are discussed together with technical matters concerning the database realization and annotation results. Providing rich database statistics was considered crucial especial...


Towards automatic transcription of spontaneous presentations

This paper reports various investigations on recognizing spontaneous presentation speech in connection with the “Spontaneous Speech” national project started in 1999. Presentation speech uttered by 10 male speakers of approximately 4.5 hours duration has been recognized. Experimental results show that acoustic and language modeling based on an actual spontaneous speech corpus is far more effect...



Journal:
  • Speech Communication

Volume 48  Issue 

Pages  -

Publication date: 2006